Abstract. Logical reasoning about program data often requires dealing
with heap structures as well as scalar data types. Recent advances in
Satisfiability Modulo Theories (SMT) already offer efficient procedures
for dealing with scalars, yet they lack any support for dealing with heap
structures. In this paper, we present an approach that integrates Separation
Logic, a prominent logic for reasoning about list segments on the
heap, with SMT. We follow a model-based approach that communicates
aliasing among heap cells between the SMT solver and the Separation
Logic reasoning part. An experimental evaluation using the Z3 solver
indicates that our approach can effectively put to work the advances in
SMT for dealing with heap structures. This is the first decision procedure
for the combination of separation logic with SMT theories.



1 Introduction 

Satisfiability Modulo Theories (SMT) solvers play an important role in the
construction of abstract interpretation tools [11, 12]. They can efficiently deal with
relevant logical theories of various scalar data types, e.g., fixed-length bit-vectors
and numbers, as well as uninterpreted functions and arrays [1, 7, 14, 15, 19].
However, dealing with programs that manipulate heap-allocated data structures
using pointers exposes limitations of today's SMT solvers.

For example, SMT does not support separation logic, a promising logic
for dealing with programs that manipulate the heap following a certain
discipline [24]. Advances in the construction of such a solver could directly boost a
wide range of separation logic based verifiers: manual/tool-assisted proof
development [18, 20, 25], extended static checking [5, 17], and automatic inference of
heap shapes [2, 8, 16, 26].

In this paper we present a method for extending an SMT solver with
separation logic with the list segment predicate [3], a frequently used instance
of separation logic that is supported by the majority of existing tools. Our method decides
entailments of the form $\Pi \land \Sigma \to \Pi' \land \Sigma'$. Here, $\Pi$ and $\Pi'$ are arbitrary theory
assertions supported by SMT, while $\Sigma$ and $\Sigma'$ are spatial conjunctions of pointer
predicates $\mathsf{next}(x, y)$ and list segment predicates $\mathsf{lseg}(x, y)$. Symbols occurring in
the spatial conjunctions can also occur in $\Pi$ and $\Pi'$.

The crux of our method lies in an interaction of the model based approach to
combination of theories [13] and a so-called match function that we propose for
establishing logical implication between a pair of spatial conjunctions. We use
models of $\Pi$, which we call stacks, to guide the process of showing that every
heap that satisfies $\Sigma$ also satisfies $\Sigma'$. In return, the match function collects an
assertion that describes a set of stacks for which the current derivation is also
applicable. This assertion is then used to take those stacks into account for which
we have not proved the entailment yet. As a result, our method can benefit from
the efficiency offered by SMT for maintaining a logical context keeping track of
stacks for which the entailment is already proved.

In summary, we present (to the best of our knowledge) the first SMT based 
decision procedure for separation logic with list segments. Our main contribu- 
tion is the entailment checking algorithm for separation logic combined with 
decidable theories, together with its correctness proof. Furthermore we provide 
an implementation of the algorithm using Z3 for theory reasoning, and an eval- 
uation on micro-benchmarks. 

The paper is organised as follows. A run of the algorithm is illustrated in 
Section 2. We give preliminary definitions in Section 3. Our method is described 
in Section 4. All proofs are presented in Section 5. We present an experimental 
evaluation in Section 6. Conclusions are finally presented in Section 7. 

Related work Our method is directly inspired by a theorem prover for
separation logic with list segments [23] based on paramodulation techniques [22] to
deal with equality reasoning, an approach that turned out to be quite advantageous
compared to the SmallFoot-based proof systems previously developed.

While [23] only deals with equalities, the work in this paper supports arbi- 
trary SMT theory expressions in the entailment. Theory extensions of paramod- 
ulation are still an open problem — even state-of-the-art first order provers de- 
liver poor performance on problems with linear arithmetic so it is not evident 
how to extend [23] with theory reasoning. Similarly, it is unclear how to extend 
SmallFoot or jStar to obtain a decision procedure with rich theory reasoning. 

Our match function can be seen as a generalisation of the unfolding inferences,
geared towards interaction with the logical context of an SMT solver, rather than
literals in a clausal representation of the entailment problem. Last but not least,
in that previous work the combination with paramodulation is given by a quite
complex inference system, at a level of detail which would not be accessible through
a black-box SMT prover. The original proof system for list segments [3, 4] gives
a starting point for the design of our match function. However, while the proof
system needs to branch and perform case reasoning during proof search, the
match function is a deterministic, linear pass over the spatial conjuncts.

Recently, entailment between separation logic formulas where $\Pi$ and $\Pi'$
are conjunctions of (dis-)equalities was shown to be decidable in polynomial
time [10]. While we are primarily interested in reasoning about rich theory
assertions describing stacks, exploration of this polynomial time result is an
interesting direction for future work. Regarding a Nelson-Oppen combination of
decision procedures [21], we see an algorithm following this combination approach as
an interesting and difficult question for future work. A direct application of
such theory combination does not work, since it requires a satisfiability checker
for sets of (possibly negated) spatial conjunctions. The interplay of conjunction,
negation and spatial conjunction is likely to turn this into a PSPACE problem.
In contrast, the spatial reasoning in our approach has linear complexity, thus
shifting the computational complexity to the SMT prover instead.

Chin et al. [9] present a fold/unfold mechanism to deal with user-specified
well-founded recursive predicates. Due to such a general setting, it does not
provide completeness. Our logic is more restrictive, allowing us to develop a
complete decision procedure. Similarly, Botincan et al. [6] rely on a SmallFoot based
proof system which, although it does not guarantee completeness on the fragment
we consider, is able to deal with user provided inference and rewriting rules.



2 Illustration 

In this section we illustrate our algorithm using a high-level description and a 
simple example. To this end we prove the validity of the entailment: 

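    $\underbrace{c < e}_{\Pi} \land \underbrace{\mathsf{lseg}(a, b) * \mathsf{lseg}(a, c) * \mathsf{next}(c, d) * \mathsf{lseg}(d, e)}_{\Sigma} \to \underbrace{\mathsf{lseg}(b, c) * \mathsf{lseg}(c, e)}_{\Sigma'}$ .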

Abstractly, the algorithm performs the following key steps. It symbolically
enumerates models that satisfy $\Pi$ and yield a satisfiable heap part for $\Sigma$ in the
antecedent. For each such assignment $s$ the algorithm attempts to (symbolically)
prove that each heap $h$ satisfying the antecedent, i.e., $s, h \models \Pi \land \Sigma$, also satisfies
the consequence, i.e., $s, h \models \Pi' \land \Sigma'$. Finally, we generalise the assignment $s$ and
use the corresponding assertion to prune further models of $\Pi$ that would lead
to similar reasoning steps as $s$. The entailment is valid if and only if all models of
the pure parts are successfully considered.

For our example we begin with the construction of the constraint that
guarantees the satisfiability of the heap part of the antecedent. This constraint requires
that each pair of spatial predicates in $\Sigma$ is not colliding, i.e., if two predicates
start from the same heap location then one of them represents an empty heap. A
list segment, say $\mathsf{lseg}(a, b)$, represents an empty heap if its start and end locations
are equal, i.e., if $a \simeq b$. A points-to predicate, say $\mathsf{next}(c, d)$, always represents
a non-empty heap. For the predicates $\mathsf{lseg}(a, b)$ and $\mathsf{lseg}(d, e)$ the absence of
collision is represented as $a \simeq d \to a \simeq b \lor d \simeq e$, i.e., if the start location $a$ of the
first predicate is equal to the start location $d$ of the second predicate then either
of the predicates represents an empty heap. The pairs of predicates in $\Sigma$
produce the following non-collision assertions.

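    $a \simeq a \to a \simeq b \lor a \simeq c$      $\mathsf{lseg}(a, b)$ and $\mathsf{lseg}(a, c)$
    $a \simeq c \to a \simeq b \lor \bot$            $\mathsf{lseg}(a, b)$ and $\mathsf{next}(c, d)$
    $a \simeq d \to a \simeq b \lor d \simeq e$      $\mathsf{lseg}(a, b)$ and $\mathsf{lseg}(d, e)$
    $a \simeq c \to a \simeq c \lor \bot$            $\mathsf{lseg}(a, c)$ and $\mathsf{next}(c, d)$
    $a \simeq d \to a \simeq c \lor d \simeq e$      $\mathsf{lseg}(a, c)$ and $\mathsf{lseg}(d, e)$
    $c \simeq d \to \bot \lor d \simeq e$            $\mathsf{next}(c, d)$ and $\mathsf{lseg}(d, e)$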

We refer to the conjunction of the above assertions as $\text{well-formed}(\Sigma)$.
Next, we use an SMT solver to find a model for $\Pi \land \text{well-formed}(\Sigma)$. If no
such model exists the entailment is vacuously true. For our example, however,
the solver finds the model $s = \{a \mapsto 0, b \mapsto 0, c \mapsto 0, d \mapsto 1, e \mapsto 1\}$.

We then symbolically show that every heap $h$ that is a model of $\Sigma$ is also a model
of $\Sigma'$. We do this by showing that $\Sigma$ and $\Sigma'$ are matching, i.e., for each predicate
in $\Sigma'$ there is a corresponding 'chain' of predicates in $\Sigma$. The chain condition
requires adjacent predicates to have a location in common, namely, the finish
location of a predicate is equal to the start location of the next with respect to $s$.

Since matching only needs to deal with predicates representing non-empty
heaps, we first normalise $\Sigma$ and $\Sigma'$ by removing spatial predicates that are empty
in the given model $s$, i.e., we remove each list segment predicate whose start and
finish locations are equal with respect to $s$. From $\Sigma$ we remove $\mathsf{lseg}(a, b)$ since
$s(a) = s(b) = 0$, and from $\Sigma'$ we cannot remove anything.

Now we attempt to find a match for $\mathsf{lseg}(b, c) \in \Sigma'$ in the normalised
antecedent $\mathsf{lseg}(a, c) * \mathsf{next}(c, d) * \mathsf{lseg}(d, e)$. The chain should start with $\mathsf{lseg}(a, c)$
since $s(a) = s(b)$. Since $\mathsf{lseg}(a, c)$ finishes at the same location as $\mathsf{lseg}(b, c)$ in
every model, we are done with the matching for $\mathsf{lseg}(b, c)$. Since $\mathsf{lseg}(a, c)$ was
used to construct a chain, we cannot consider it in the remaining matching steps
(but only for the same model $s$). Next we compute a matching for $\mathsf{lseg}(c, e) \in \Sigma'$
using the remaining predicates $\mathsf{next}(c, d) * \mathsf{lseg}(d, e)$ from $\Sigma$. We begin the chain
using $\mathsf{next}(c, d)$ since it has the same start location as $\mathsf{lseg}(c, e)$. Since the finish
location of $\mathsf{next}(c, d)$ is not equal to $e$ with respect to $s$ we still need to connect
$d$ and $e$. We perform this connection by an additional matching request that
requires us to match $\mathsf{lseg}(d, e)$ using the remaining predicates from $\Sigma$, i.e., using
only $\mathsf{lseg}(d, e)$. Fortunately, this matching request can be trivially satisfied. Since
all predicates of $\Sigma'$ are matched, and all predicates in $\Sigma$ were used for matching,
we conclude that $\Sigma$ and $\Sigma'$ exactly match with respect to the current $s$.

The algorithm notices that from the model $s$ only the assertion $a \simeq b$ was
necessary to perform the matching. Hence, the model $s$ is generalised to the assertion
$U = (a \simeq b)$. We continue the enumeration of pure models for the antecedent,
excluding those where $a \simeq b$. The SMT solver reports that $\Pi \land \text{well-formed}(\Sigma) \land \lnot U$
is not satisfiable. Hence we conclude that the entailment is valid.

3 Preliminaries 

We write $f : X \to Y$ to denote a function with domain $X = \mathrm{dom}\, f$ and range $Y$,
while $f : X \rightharpoonup Y$ is a partial function with $\mathrm{dom}\, f \subseteq X$. We write $f_1 * \cdots * f_n$ to
simultaneously denote the union $f_1 \cup \cdots \cup f_n$ of $n$ functions, and assert that their
domains are pairwise disjoint, i.e. $\mathrm{dom}\, f_i \cap \mathrm{dom}\, f_j = \emptyset$ when $i \neq j$. Given two
functions $f : Y \to Z$ and $g : X \to Y$, we write $f \circ g$ to denote their composition,
i.e. $(f \circ g)(x) = f(g(x))$ for every $x \in \mathrm{dom}\, g$. We sometimes write functions
explicitly by enumerating their elements, for example $f = \{a \mapsto b, b \mapsto c\}$ is the
function with $\mathrm{dom}\, f = \{a, b\}$ and such that $f(a) = b$ and $f(b) = c$.
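
The notation above can be made concrete with a small sketch that models finite functions as Python dictionaries. The helper names `star` and `compose` are hypothetical and chosen only for this illustration; they are not part of the paper.

```python
def star(*fs):
    """Union f_1 * ... * f_n of finite functions (dicts).

    Raises ValueError when two domains overlap, mirroring the side
    condition that the domains must be pairwise disjoint.
    """
    out = {}
    for f in fs:
        overlap = out.keys() & f.keys()
        if overlap:
            raise ValueError(f"domains overlap on {sorted(overlap)}")
        out.update(f)
    return out


def compose(f, g):
    """Composition (f o g)(x) = f(g(x)) for every x in dom g."""
    return {x: f[g[x]] for x in g}


f = {"a": "b", "b": "c"}        # the function {a -> b, b -> c} from the text
heap = star({1: 3}, {3: 2})     # two heaplets with disjoint domains
```

Treating `*` as a partial operation that fails on overlapping domains is what later gives the spatial conjunction its separation reading.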

Syntax of separation logic We assume a sorted language with both theory
and uninterpreted symbols. Each function symbol $f$ has an arity $n$ and a
signature $f : \tau_1 \times \cdots \times \tau_n \to \tau$, taking $n$ arguments of respective sorts $\tau_i$ and returning
an expression of sort $\tau$. A constant symbol is a 0-ary function symbol. A variable
is an uninterpreted constant symbol, and Var denotes the set of all variables in
the language. Constant and function symbols are combined as usual, respecting
their sorts, to build syntactically valid expressions. We use $x : \tau$ to denote an
expression $x$ of sort $\tau$, and C to denote the set of all expressions in the language.

We assume that, among the available sorts, there are Int and Bool for,
respectively, integer and boolean expressions. We refer to a function symbol of boolean
range as a predicate symbol, and a boolean expression as a formula. We also
assume the existence of a built-in predicate $\simeq\; : \tau \times \tau \to \mathsf{Bool}$ for testing equality
between two expressions of the same sort; as well as standard theory symbols
from the boolean domain, that is: conjunction ($\land$), disjunction ($\lor$), negation ($\lnot$),
truth ($\top$), falsity ($\bot$), implication ($\to$), bi-implication ($\leftrightarrow$) and first order
quantifiers ($\forall$, $\exists$). Theory symbols for arithmetic may also be present, and we use nil
as an alias for the integer constant 0.

Additionally, we also define spatial symbols to build expressions that
describe properties about memory heaps. We have the spatial predicate symbols
$\mathsf{emp} : \mathsf{Bool}$, $\mathsf{next} : \mathsf{Int} \times \mathsf{Int} \to \mathsf{Bool}$ and $\mathsf{lseg} : \mathsf{Int} \times \mathsf{Int} \to \mathsf{Bool}$ for, respectively,
the empty heap, a points-to relation, and acyclic list segments; their semantics
are described in the following section. Furthermore, we also have the symbol for
spatial conjunction $* : \mathsf{Bool} \times \mathsf{Bool} \to \mathsf{Bool}$. A formula or an expression is said to
be pure if it contains no spatial symbols.

Although in principle one can write spatial conjunctions of arbitrary boolean
formulas, in our context we only deal with the case where each conjunct is
a spatial predicate. So when we say a "spatial conjunction" what we actually
mean is a "spatial conjunction of spatial predicates". Furthermore, at the
meta-level, we treat a spatial conjunction $\Sigma = S_1 * \cdots * S_n$ as a multi-set of boolean
spatial predicates, and write $|\Sigma| = n$ to denote the number of predicates in
the conjunction. In particular we use set theory symbols to describe relations
between spatial predicates and spatial conjunctions, which are always to be
interpreted as multi-set operations. For example:

    $\mathsf{next}(y, z) \in \mathsf{lseg}(x, y) * \mathsf{next}(y, z)$
    $\mathsf{next}(x, y) * \mathsf{next}(x, y) \not\subseteq \mathsf{next}(x, y)$
    $\mathsf{emp} * \mathsf{emp} * \mathsf{emp} \setminus \mathsf{emp} = \mathsf{emp} * \mathsf{emp}$ .
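
As a rough illustration of the multi-set reading, the three examples above can be replayed with Python's `collections.Counter`. The tuple encoding of predicates and the helper `is_submultiset` are assumptions made for this sketch only.

```python
from collections import Counter

# Spatial predicates encoded as tuples; a spatial conjunction is a Counter.
lseg_xy = ("lseg", "x", "y")
next_yz = ("next", "y", "z")
next_xy = ("next", "x", "y")
emp = ("emp",)


def is_submultiset(a, b):
    """True when every predicate occurs in b at least as often as in a."""
    return all(b[p] >= n for p, n in a.items())


# next(y, z) is a member of lseg(x, y) * next(y, z)
member = Counter([lseg_xy, next_yz])[next_yz] > 0

# next(x, y) * next(x, y) is not contained in next(x, y): multiplicity matters
contained = is_submultiset(Counter([next_xy, next_xy]), Counter([next_xy]))

# emp * emp * emp \ emp = emp * emp: removal deletes a single occurrence
difference = Counter([emp, emp, emp]) - Counter([emp])
```

A plain set would wrongly identify `next(x, y) * next(x, y)` with `next(x, y)`, which is why multiplicities have to be tracked.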

Semantics of separation logic Each sort $\tau$ is associated with a set of values,
which we also denote by $\tau$, usually according to their background theories; e.g.
$\mathsf{Int} = \{\ldots, -1, 0, 1, \ldots\}$, and $\mathsf{Bool} = \{\bot, \top\}$. We use $\mathsf{Val} = \tau_1 \uplus \cdots \uplus \tau_n$ to denote
the disjoint union of all values for all sorts in the language.

A stack is a function $s : \mathsf{Var} \to \mathsf{Val}$ mapping variables to values in their
respective sorts, i.e. for a variable $v : \tau$ we have $s(v) \in \tau$. The domain of $s$ is
naturally extended over arbitrary pure expressions in C using an appropriate
interpretation for their theory symbols, e.g. $s(1 + 2) = 3$. In our context, a
heap corresponds to a partial function $h : \mathsf{Int} \rightharpoonup \mathsf{Val}$ mapping memory locations,
represented as integers, to values.
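
     1: function prove($\Pi \land \Sigma \to \Pi' \land \Sigma'$)
     2:     $\Gamma := \Pi \land \text{well-formed}(\Sigma)$
     3:     while exists $s \models \Gamma$ do
     4:         $U := \text{match}(s, \hat{\Sigma}, \Sigma, \Sigma')$ with $\hat{\Sigma} = \Sigma$
     5:         if $s \not\models \Pi' \land U$ then return invalid
     6:         $\Gamma := \Gamma \land \lnot(\Pi' \land U)$
     7:     return valid

     8: function match($s, \hat{\Sigma}, \Sigma, \Sigma'$)
     9:     if exists $S \in \Sigma$ such that $s \models \mathsf{empty}(S)$
    10:         return $\mathsf{empty}(S) \land \text{match}(s, \hat{\Sigma}, \Sigma \setminus S, \Sigma')$
    11:     if exists $S' \in \Sigma'$ such that $s \models \mathsf{empty}(S')$
    12:         return $\mathsf{empty}(S') \land \text{match}(s, \hat{\Sigma}, \Sigma, \Sigma' \setminus S')$
    13:     if exists $S \in \Sigma$, $S' \in \Sigma'$ such that $s \models \text{match-step}(\hat{\Sigma}, S, S')$
    14:         return $\text{match-step}(\hat{\Sigma}, S, S') \land \text{match}(s, \hat{\Sigma}, \Sigma \setminus S, (\Sigma' \setminus S') * \mathsf{residue}(S, S'))$
    15:     else
    16:         return $(\Sigma = \emptyset) \land (\Sigma' = \emptyset)$

Fig. 1. Model driven entailment checker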

Given a stack $s$, a heap $h$, and a formula $F$ we inductively define the
satisfaction relation of separation logic, denoted $s, h \models F$, as:

    $s, h \models \Pi$                  if $\Pi$ is pure and $s(\Pi) = \top$,
    $s, h \models \mathsf{emp}$         if $h = \emptyset$,
    $s, h \models \mathsf{next}(x, y)$  if $h = \{s(x) \mapsto s(y)\}$,
    $s, h \models F_1 * F_2$            if $h = h_1 * h_2$ for some $h_1$ and $h_2$
                                        such that $s, h_1 \models F_1$ and $s, h_2 \models F_2$.

Semantics for the acyclic list segment is introduced through the inductive
definition $\mathsf{lseg}(x, z) = (x \simeq z \land \mathsf{emp}) \lor (x \not\simeq z \land \exists y.\ \mathsf{next}(x, y) * \mathsf{lseg}(y, z))$. As an
example consider $\{x \mapsto 1, y \mapsto 2\}, \{1 \mapsto 3, 3 \mapsto 2\} \models \mathsf{lseg}(x, y)$.
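
The satisfaction relation for spatial conjunctions of `emp`, `next` and `lseg` can be sketched as a small executable check. This is an illustrative reading of the definitions above, not the paper's implementation: stacks and heaps are assumed to be Python dictionaries, predicates are tuples, and the names `consume` and `satisfies` are hypothetical. The sketch relies on the observation that, for this fragment, the cells a predicate must own are determined by the stack and the heap, so each predicate can remove its own cells from a shared copy of the heap.

```python
def consume(pred, s, h):
    """Remove from h (in place) the cells pred must own under stack s.

    Returns False when the required cells are missing or do not fit.
    """
    kind = pred[0]
    if kind == "emp":
        return True
    _, x, y = pred
    if kind == "next":
        # next(x, y) owns exactly the cell s(x) -> s(y)
        if s[x] not in h or h[s[x]] != s[y]:
            return False
        del h[s[x]]
        return True
    # lseg(x, y): follow pointers from s(x) until s(y) is first reached
    cur = s[x]
    while cur != s[y]:
        if cur not in h:
            return False
        cur = h.pop(cur)
    return True


def satisfies(s, h, preds):
    """Check s, h |= S_1 * ... * S_n for a list of spatial predicates."""
    rest = dict(h)
    return all(consume(p, s, rest) for p in preds) and not rest


# The example from the text: {x -> 1, y -> 2}, {1 -> 3, 3 -> 2} |= lseg(x, y)
example = satisfies({"x": 1, "y": 2}, {1: 3, 3: 2}, [("lseg", "x", "y")])
```

Requiring the shared copy to be empty at the end enforces that the heaplets cover the whole heap, and a cell claimed twice is simply no longer there for the second predicate, which enforces separation.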

When $s, h \models F$ we say that the interpretation $(s, h)$ is a model of the formula
$F$. A formula is satisfiable if it admits at least one model, and valid if it is satisfied
by all possible interpretations. Note, in particular, that an entailment $F \to G$ is
valid if every model of $F$ is also a model of $G$. Finally, for a formula $F$ we write
$s \models F$ if it is the case that, for every heap $h$, we have that $s, h \models F$ holds.

Note that nil is not treated in any special way by this logic. If one wants nil 
to regain its expected behaviour, i.e. nothing can be allocated at the nil address, 
it is enough to consider next(nil,0) * F, where F is an arbitrary formula. 

4 Decision procedure for list segments and SMT theories 

In this section we define and describe the building blocks that, when put together
as shown in the prove and match procedures of Figure 1, constitute a decision
procedure for entailment checking. The procedure works for entailments of the
form $\Pi \land \Sigma \to \Pi' \land \Sigma'$, where both $\Pi$ and $\Pi'$ are pure formulas, with respect to
any background theory supported by the SMT solver, and both $\Sigma$ and $\Sigma'$ are
spatial conjunctions.

To abstract away the specifics of a spatial predicate S, we first define addr(S) 
and empty(S) — respectively the address and the emptiness condition of a given 
spatial predicate — as follows: 



    $S$                      $\mathsf{addr}(S)$    $\mathsf{empty}(S)$
    $\mathsf{emp}$                                 $\top$
    $\mathsf{next}(x, y)$    $x$                   $\bot$
    $\mathsf{lseg}(x, y)$    $x$                   $x \simeq y$



Intuitively, if the emptiness condition is true with respect to a stack-model $s$, the
portion of the heap-model that corresponds to $S$ must be empty. Alternatively,
if the emptiness condition is false with respect to $s$, the value associated with its
address must occur in the domain of any heap satisfying the spatial predicate.
Formally: given $s \models \mathsf{empty}(S)$ for a stack $s$, we have $s, h \models S$ if, and only if, the
heap $h = \emptyset$; and if $s, h \models \lnot\mathsf{empty}(S) \land S$ then, necessarily, $s(\mathsf{addr}(S)) \in \mathrm{dom}\, h$.

Well-formedness Before introducing the well-formed condition, occurring at 
line 2 of the algorithm in Figure 1, we first define the notion of collision between 
spatial predicates. Given any two spatial predicates $S$ and $S'$, the formula

    $\mathsf{collide}(S, S') = \lnot\mathsf{empty}(S) \land \lnot\mathsf{empty}(S') \land \mathsf{addr}(S) \simeq \mathsf{addr}(S')$

states that two predicates collide if, with respect to a stack-model, they are both 
non-empty and share the same address. This would cause a problem if both S 
and S' occur together in a spatial conjunction, since they would assert that the 
same address is allocated at two disjoint — separated — portions of the heap. 

Given a spatial conjunction $\Sigma = S_1 * \cdots * S_n$, the well-formedness condition
is defined as the pure formula

    $\text{well-formed}(\Sigma) = \bigwedge_{1 \le i < j \le n} \lnot\mathsf{collide}(S_i, S_j)$ ,

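stating that no pair of predicates in the spatial conjunction collide. As an
example consider the spatial conjunction

    $\Sigma = \underbrace{\mathsf{next}(x, y)}_{S_1} * \underbrace{\mathsf{lseg}(x, z)}_{S_2} * \underbrace{\mathsf{next}(w, z)}_{S_3}$

we obtain

    $\mathsf{collide}(S_1, S_2) = (\top \land x \not\simeq z \land x \simeq x) = (x \not\simeq z)$
    $\mathsf{collide}(S_1, S_3) = (\top \land \top \land x \simeq w) = (x \simeq w)$
    $\mathsf{collide}(S_2, S_3) = (x \not\simeq z \land \top \land x \simeq w) = (x \not\simeq z \land x \simeq w)$
    $\text{well-formed}(\Sigma) = \lnot(x \not\simeq z) \land \lnot(x \simeq w) \land \lnot(x \not\simeq z \land x \simeq w) = (x \simeq z \land x \not\simeq w)$ .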



That is, the formula is well-formed only when $x \simeq z$, so that the second predicate
is empty, and $x \not\simeq w$, so that the first and third do not collide. In general, the
well-formedness condition is quite important since, as the next theorem states,
it characterises the satisfiability of spatial conjunctions.

Theorem 1. A spatial conjunction $\Sigma$ is satisfiable if, and only if, the pure
formula $\text{well-formed}(\Sigma)$ is satisfiable.

Matching step We now proceed towards the introduction of the match-step
condition, used at line 14 in Figure 1, which lies at the core of our matching
procedure. For this we first define, given a spatial conjunction $\Sigma = S_1 * \cdots * S_n$
and an expression $x$, the allocation condition

    $\mathsf{alloc}(\Sigma, x) = \bigvee_{1 \le i \le n} \lnot\mathsf{empty}(S_i) \land x \simeq \mathsf{addr}(S_i)$

which holds, with respect to a stack-model $s$, when a corresponding heap-model $h$
for $\Sigma$ would necessarily have to include $s(x)$ in its domain. Continuing from our
previous example we have that

    $\mathsf{alloc}(\Sigma, z) = (\top \land z \simeq x) \lor (x \not\simeq z \land z \simeq x) \lor (\top \land z \simeq w) = (z \simeq x \lor z \simeq w)$ .

That is, the value of $z$ must be allocated in the heap if either $z \simeq x$, so it is
needed to satisfy $\mathsf{next}(x, y)$, or $z \simeq w$ and it is needed to satisfy $\mathsf{next}(w, z)$. If
otherwise the allocation condition is false, although it may occur, there is no
actual need for $z$ to be allocated in the domain of the heap.
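
The two simplifications in the running example, for the well-formedness and the allocation conditions, can be double-checked by brute force. The sketch below is an illustration under stated assumptions: predicates are tuples, stacks are dictionaries, the function names mirror the paper's conditions but are otherwise hypothetical, and the conditions are evaluated on concrete stacks rather than built as symbolic formulas. Four values per variable are enough here because only equalities between four variables are involved.

```python
from itertools import product


def empty(S, s):
    """Emptiness condition: only lseg(x, y) with s(x) = s(y) is empty."""
    kind, x, y = S
    return kind == "lseg" and s[x] == s[y]


def collide(S, T, s):
    """Both predicates are non-empty and start at the same address."""
    return not empty(S, s) and not empty(T, s) and s[S[1]] == s[T[1]]


def well_formed(sigma, s):
    """No pair of predicates in the conjunction collides under s."""
    n = len(sigma)
    return not any(collide(sigma[i], sigma[j], s)
                   for i in range(n) for j in range(i + 1, n))


def alloc(sigma, v, s):
    """The value of v is the address of some non-empty predicate."""
    return any(not empty(S, s) and s[v] == s[S[1]] for S in sigma)


# Sigma = next(x, y) * lseg(x, z) * next(w, z), as in the example
sigma = [("next", "x", "y"), ("lseg", "x", "z"), ("next", "w", "z")]
names = ["x", "y", "z", "w"]
stacks = [dict(zip(names, vals)) for vals in product(range(4), repeat=4)]

# well-formed(Sigma) should agree with (x = z and x != w) on every stack
wf_ok = all(well_formed(sigma, s) == (s["x"] == s["z"] and s["x"] != s["w"])
            for s in stacks)

# alloc(Sigma, z) should agree with (z = x or z = w) on every stack
alloc_ok = all(alloc(sigma, "z", s) == (s["z"] == s["x"] or s["z"] == s["w"])
               for s in stacks)
```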

Now, when trying to prove an entailment $s \models \Sigma \to \Sigma'$, we want to show that
any heap model of $\Sigma$ is also a model of $\Sigma'$. Thus, if we find a pair of colliding
predicates $S \in \Sigma$ and $S' \in \Sigma'$, then the portion of the heap that satisfies $S$ should
overlap with the portion of the heap that satisfies $S'$. In fact, it is not hard
to convince oneself, for the list segment predicates considered, that the heap
model of $S'$ should match exactly that of $S$ plus some extra surplus.

In the following definitions residue gives the precise value of the extra surplus,
while enclosed specifies additional conditions which are necessary so that the
model of $S$ doesn't leak outside the model of $S'$.



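    $S'$                      $S$                      $\mathsf{residue}(S, S')$    $\mathsf{enclosed}(\hat{\Sigma}, S, S')$
    $\mathsf{next}(x', z)$    $\mathsf{next}(x, y)$    $\mathsf{emp}$               $y \simeq z$
    $\mathsf{lseg}(x', z)$    $\mathsf{next}(x, y)$    $\mathsf{lseg}(y, z)$        $\top$
    $\mathsf{next}(x', z)$    $\mathsf{lseg}(x, y)$    $\mathsf{emp}$               $\bot$
    $\mathsf{lseg}(x', z)$    $\mathsf{lseg}(x, y)$    $\mathsf{lseg}(y, z)$        $y \not\simeq z \to \mathsf{alloc}(\hat{\Sigma}, z)$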



The matching step condition is the formula

    $\text{match-step}(\hat{\Sigma}, S, S') = \mathsf{collide}(S, S') \land \mathsf{enclosed}(\hat{\Sigma}, S, S')$ .

To formalise our stated intuition, the following proposition articulates how 
the residue that is computed between two colliding predicates is indeed satisfied 
by the remaining heap surplus. The validity of this statement, as in the case 
of the subsequent two propositions, can be easily verified by inspection of the 
relevant definitions. 
Proposition 1. Given two spatial predicates $S$, $S'$, a stack $s \models \mathsf{collide}(S, S')$
and a heap $h$ such that $s, h \models S'$, if there is a partition $h = h_1 * h_2$ for which
$s, h_1 \models S$, it necessarily follows that $s, h_2 \models \mathsf{residue}(S, S')$.

Moreover, for any stack satisfying the matching step condition, we are free
to replace $S'$ in $\Sigma'$ with the matched expression $S * \mathsf{residue}(S, S')$. Formally we
state the following proposition.

Proposition 2. Given a stack $s \models \text{match-step}(\hat{\Sigma}, S, S')$, where $S$ and $S'$ are
spatial predicates, and $S$ occurs in the spatial conjunction $\hat{\Sigma}$, for any spatial
conjunction $\Sigma'$ containing $S'$ we have that

    $s \models (\Sigma' \setminus S') * S * \mathsf{residue}(S, S') \to \Sigma'$

Finally, we state that the enclosing condition is complete in the sense that,
if it were not satisfied by a stack $s$, then one could build a counterexample for
the matching $S * \mathsf{residue}(S, S') \to S'$.

Proposition 3. Given two spatial predicates $S$, $S'$, a spatial conjunction $\hat{\Sigma}$ that
contains $S$, a stack $s$ and a two-part heap $h = h_1 * h_2$ such that $s, h_1 * h_2 \models \hat{\Sigma}$
and $s, h_2 \models S * \mathsf{residue}(S, S')$, if $s \models \mathsf{collide}(S, S') \land \lnot\mathsf{enclosed}(\hat{\Sigma}, S, S')$, then
there is a $h_2'$ such that $s, h_1 * h_2' \models \hat{\Sigma}$ but $s, h_2' \not\models S * \mathsf{residue}(S, S') \to S'$.

As an example consider the case where $S = \mathsf{lseg}(x, y)$ and $S' = \mathsf{lseg}(x', z)$,
such that $\mathsf{residue}(S, S') = \mathsf{lseg}(y, z)$. Take some stack $s \models \mathsf{collide}(S, S')$ and the
heap $h_2 = \{s(x) \mapsto s(y), s(y) \mapsto s(z)\}$ as a model of $\mathsf{lseg}(x, y) * \mathsf{lseg}(y, z)$. From
$s \models \lnot\mathsf{enclosed}(\hat{\Sigma}, S, S')$ it follows that $s(x) \neq s(y)$ and the address $s(z)$ does
not need to be allocated anywhere in $h = h_1 * h_2$. This allows us to patch and
let $h_2' = \{s(x) \mapsto s(z), s(z) \mapsto s(y), s(y) \mapsto s(z)\}$, which is still a model of the
pair $\mathsf{lseg}(x, y) * \mathsf{lseg}(y, z)$ but, due to the introduced cycle, not of $\mathsf{lseg}(x', z)$.

Matching and proving To finalise the description of our decision procedure 
for entailment checking we have only left to put all the ingredients together, as 
shown in Figure 1, into the match and prove functions. 

The match function tries to establish whether $s \models \hat{\Sigma} \to (\hat{\Sigma} \setminus \Sigma) * \Sigma'$. Initially
called with $\Sigma$ set to $\hat{\Sigma}$, at the top level this is in fact equivalent to checking the
validity of $s \models \hat{\Sigma} \to \Sigma'$. During the execution process $\hat{\Sigma}$ will retain its initial
value, $\Sigma$ and $\Sigma'$ carry the portions of the entailment that are left to match,
while $\hat{\Sigma} \setminus \Sigma$ is the fragment already matched. As the function progresses, the
conjunctions $\Sigma$ and $\Sigma'$ will become shorter, while the matched portion $\hat{\Sigma} \setminus \Sigma$
grows. If successful both $\Sigma$ and $\Sigma'$ will become empty, yielding at the end the
trivial entailment $s \models \hat{\Sigma} \to \hat{\Sigma}$.

The function begins by inspecting $\Sigma$ and $\Sigma'$ to discard, at lines 10 and 12, any
empty predicates with respect to $s$, and recursively calling itself to verify the rest
of the entailment. After removing all such empty predicates, if a valid matching
step is found, the predicate $S'$ occurring in $\Sigma'$ is replaced with $S * \mathsf{residue}(S, S')$,
so that $S$, which now occurs both in $\Sigma$ and $\Sigma'$, can be moved to the matched
part of the entailment in the recursive call at line 14.
If the function is successful, after reaching the bottom of the recursion at
line 16 with both $\Sigma$ and $\Sigma'$ becoming empty, the return value collects a
conjunction of all assumptions made on the values of the stack. This allows us to generalise the
proof, which works not only for the particular stack $s$, but for any stack satisfying
the same assumptions. Otherwise, if the bottom of the recursion is reached with
some portions still left to match, the function returns an unsatisfiable formula
signalling the existence of a counterexample for the entailment. This behaviour
is formalised in the following theorem, proved later in Section 5.

Theorem 2. Given a pair of spatial conjunctions $\Sigma$, $\Sigma'$ and a stack $s$ such that
$s \models \text{well-formed}(\Sigma)$, we have that:

- the procedure $\text{match}(s, \Sigma, \Sigma, \Sigma')$ always terminates with a result $U$,

- the execution requires $O(n)$ recursive steps, where $n = |\Sigma| + |\Sigma'|$,

- if $s \models U$ then the entailment $U \land \Sigma \to \Sigma'$ is valid, and

- if $s \not\models U$ then $s \not\models \Sigma \to \Sigma'$.

The main prove function, which checks whether $\Pi \land \Sigma \to \Pi' \land \Sigma'$ is valid,
begins with the pure formula $\Gamma := \Pi \land \text{well-formed}(\Sigma)$. An SMT solver iteratively
finds models for $\Gamma$, which become candidate stack models to guide the search for a
proof or a counterexample. Given one such stack $s$, the match function is called to
check the validity of the entailment with respect to $s$. If successful, match returns
a formula $U$ generalising the conditions under which the entailment is valid, so the
search may continue for stacks where $U$ does not hold. The iterations proceed
until either all possible stacks have been discarded, or a counterexample is found
in the process. It is important to stress that the function does not enumerate all
concrete models but, rather, the equivalence classes returned by match. Formally
we state the following theorem, whose proof is given in Section 5.

Theorem 3. Given two pure formulas $\Pi$, $\Pi'$, and two spatial formulas $\Sigma$, $\Sigma'$,
we have that:

- the procedure $\text{prove}(\Pi \land \Sigma \to \Pi' \land \Sigma')$ always terminates, and

- the return value corresponds to the validity of $\Pi \land \Sigma \to \Pi' \land \Sigma'$.
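
The interplay of prove and match in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not the authors' Z3-based implementation: the SMT solver is replaced by brute-force enumeration of stacks over a small finite range (adequate only when that range is large enough for the pure constraints at hand), pure conditions are represented as Python functions from stacks to booleans instead of symbolic formulas, and the `emp` residue is represented by adding nothing. All function names are hypothetical.

```python
from itertools import product

# Predicates are ("next", x, y) or ("lseg", x, y); a stack is a dict.
# Every condition below is a function from stacks to bool.


def empty(S):
    kind, x, y = S
    return (lambda s: s[x] == s[y]) if kind == "lseg" else (lambda s: False)


def collide(S, T):
    return lambda s: (not empty(S)(s) and not empty(T)(s)
                      and s[S[1]] == s[T[1]])


def well_formed(sigma):
    pairs = [(sigma[i], sigma[j])
             for i in range(len(sigma)) for j in range(i + 1, len(sigma))]
    return lambda s: not any(collide(S, T)(s) for S, T in pairs)


def alloc(sigma, z):
    return lambda s: any(not empty(S)(s) and s[z] == s[S[1]] for S in sigma)


def residue(S, T):
    # T plays the role of S'; an emp residue is the empty list
    return [("lseg", S[2], T[2])] if T[0] == "lseg" else []


def enclosed(sigma_hat, S, T):
    y, z = S[2], T[2]
    if T[0] == "next":
        return (lambda s: s[y] == s[z]) if S[0] == "next" else (lambda s: False)
    if S[0] == "next":
        return lambda s: True
    return lambda s: s[y] == s[z] or alloc(sigma_hat, z)(s)


def match_step(sigma_hat, S, T):
    return lambda s: collide(S, T)(s) and enclosed(sigma_hat, S, T)(s)


def match(s, sigma_hat, sigma, sigma_p):
    """Return the list of conditions whose conjunction plays the role of U."""
    for i, S in enumerate(sigma):                       # lines 9-10
        if empty(S)(s):
            return [empty(S)] + match(s, sigma_hat,
                                      sigma[:i] + sigma[i + 1:], sigma_p)
    for j, T in enumerate(sigma_p):                     # lines 11-12
        if empty(T)(s):
            return [empty(T)] + match(s, sigma_hat, sigma,
                                      sigma_p[:j] + sigma_p[j + 1:])
    for i, S in enumerate(sigma):                       # lines 13-14
        for j, T in enumerate(sigma_p):
            step = match_step(sigma_hat, S, T)
            if step(s):
                rest_p = sigma_p[:j] + sigma_p[j + 1:] + residue(S, T)
                return [step] + match(s, sigma_hat,
                                      sigma[:i] + sigma[i + 1:], rest_p)
    done = not sigma and not sigma_p                    # line 16
    return [lambda s: done]


def prove(pi, sigma, pi_p, sigma_p, variables, values):
    """Brute-force stand-in for the SMT loop of lines 2-7."""
    learned = []          # each entry corresponds to one blocked (Pi' and U)
    for vals in product(values, repeat=len(variables)):
        s = dict(zip(variables, vals))
        if not (pi(s) and well_formed(sigma)(s)):
            continue
        if any(g(s) for g in learned):
            continue      # s is already covered by an earlier generalisation
        U = match(s, sigma, sigma, sigma_p)
        holds = lambda t, U=U: pi_p(t) and all(c(t) for c in U)
        if not holds(s):
            return False  # counterexample found: invalid
        learned.append(holds)
    return True           # valid


# The entailment from Section 2
sigma = [("lseg", "a", "b"), ("lseg", "a", "c"),
         ("next", "c", "d"), ("lseg", "d", "e")]
sigma_p = [("lseg", "b", "c"), ("lseg", "c", "e")]
valid = prove(lambda s: s["c"] < s["e"], sigma, lambda s: True,
              sigma_p, list("abcde"), range(5))
```

Dropping the pure assumption `c < e` admits stacks with `c = e`, for which `lseg(c, e)` is empty while `next(c, d)` is not, so the same call with `lambda s: True` as the antecedent's pure part is expected to report the entailment as invalid. Representing U as a list of closures keeps the generalisation step visible: a later stack that satisfies every collected condition is skipped without calling `match` again.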

5 Proofs of correctness 

This section presents the main technical contribution of the paper, the proof of 
correctness of our entailment checking algorithm. The proof itself closely follows 
the structure of the previous section, filling in the technical details required to as- 
sert the statements of Theorem 1, on well-formedness, Theorem 2, on matching, 
and finally Theorem 3 on entailment checking. 

Well-formedness Soundness of the well-formed condition $\text{well-formed}(\Sigma)$, the
first half of Theorem 1, can be easily shown by noting that if a spatial
conjunction $\Sigma$ is satisfiable with respect to some stack and a heap, the formula
$\text{well-formed}(\Sigma)$ is also necessarily true with respect to the same stack.

Proposition 4. Given a spatial conjunction $\Sigma$, a stack $s$, and a heap $h$, if we
have $s, h \models \Sigma$, then also $s \models \text{well-formed}(\Sigma)$.

Proof. Let $\Sigma = S_1 * \cdots * S_n$. Since $s, h \models \Sigma$, there is a partition $h = h_1 * \cdots * h_n$
such that each $s, h_i \models S_i$. Given a pair of predicates $S_i$ and $S_j$ with $i < j$, if
either $s \models \mathsf{empty}(S_i)$ or $s \models \mathsf{empty}(S_j)$, then trivially $s \models \lnot\mathsf{collide}(S_i, S_j)$.

Assume otherwise that $s \models \lnot\mathsf{empty}(S_i) \land \lnot\mathsf{empty}(S_j)$. It follows that both
$s(\mathsf{addr}(S_i)) \in \mathrm{dom}\, h_i$ and $s(\mathsf{addr}(S_j)) \in \mathrm{dom}\, h_j$. Since by construction $h_i$ and $h_j$
have disjoint domains, we have $s(\mathsf{addr}(S_i)) \neq s(\mathsf{addr}(S_j))$. This implies the fact
that $s \models \lnot\mathsf{collide}(S_i, S_j)$. □

For completeness of the well-formed condition $\text{well-formed}(\Sigma)$, the second
half of Theorem 1, we prove a slightly more general result. In particular we show
that if a stack $s \models \text{well-formed}(\Sigma)$ then it is possible to build a heap $h$ such that
$s, h \models \Sigma$. Furthermore, we show that such $h$ is conservative in the sense that it
only allocates addresses which are strictly necessary.

Proposition 5. Given a spatial conjunction $\Sigma = S_1 * \cdots * S_n$ and a stack $s$ such
that $s \models \text{well-formed}(\Sigma)$, there is a heap $h$ for which $s, h \models \Sigma$ and, furthermore,
the domain $\mathrm{dom}\, h = \{s(\mathsf{addr}(S_i)) \mid s \models \lnot\mathsf{empty}(S_i)\}$.

Proof. Consider the heap $h = h_1 * \cdots * h_n$ where each $h_i$ is defined as follows:

- if $s \models \mathsf{empty}(S_i)$ then $h_i = \emptyset$; otherwise

- if $s \models \lnot\mathsf{empty}(S_i)$ it follows that $S_i = \mathsf{next}(x, y)$ or $S_i = \mathsf{lseg}(x, y)$, in either
  case let $h_i = \{s(x) \mapsto s(y)\}$.

By construction $s, h_i \models S_i$ and, furthermore, if $s \models \lnot\mathsf{empty}(S_i)$ it follows
that $\mathrm{dom}\, h_i = \{s(\mathsf{addr}(S_i))\}$. From this we easily get as desired that the domain
of the heap $\mathrm{dom}\, h = \{s(\mathsf{addr}(S_i)) \mid s \models \lnot\mathsf{empty}(S_i)\}$. Now, to prove that $s, h \models \Sigma$,
it only remains to show that for any pair $S_i$, $S_j$ with $i \neq j$ the domains of
their respective heaplets are disjoint, i.e. $\mathrm{dom}\, h_i \cap \mathrm{dom}\, h_j = \emptyset$.

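If either $s \models \mathsf{empty}(S_i)$ or $s \models \mathsf{empty}(S_j)$ the result is trivial. Otherwise
assume that $s \models \lnot\mathsf{empty}(S_i) \land \lnot\mathsf{empty}(S_j)$. Since $s \models \text{well-formed}(\Sigma)$, and in
particular also $s \models \lnot\mathsf{collide}(S_i, S_j)$, it follows that $s \not\models \mathsf{addr}(S_i) \simeq \mathsf{addr}(S_j)$.
Namely the address values $s(\mathsf{addr}(S_i)) \neq s(\mathsf{addr}(S_j))$ and, thus, the domains of
$h_i$ and $h_j$ are disjoint. □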

Theorem 1 follows immediately as a corollary of Propositions 4 and 5. 

Matching and proving The following proposition is the main ingredient re- 
quired to establish the soundness and completeness of the match procedure of 
Figure 1. The proof, although long and quite technical in details, follows the 
intuitive description given in Section 4 about the behaviour of match. Each of 
the main four cases in the proof corresponds, respectively to the conditions on 
lines 10 and 12, when discarding empty predicates, line 14, when a matching step 
is performed, and finally line 16, when the base case of the recursion is reached. 
Each case is further divided in two sub-cases, one for the situation when
the recursive call is successful and a proof of validity is established, and one 
for the situation when a counterexample is built. The last case, the base of the 
recursion, is divided into four sub-cases: the successful case when the matching 
is completed, the case in which all of E' is consumed but there are predicates in 
E left to match, the case in which there is a collision but the enclosure condition 
is not met, and finally the case in which there is no collision at all. 

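Proposition 6. Given three spatial formulas $\hat\Sigma$, $\Sigma$, $\Sigma'$, and
a stack $s$ such that $\Sigma \subseteq \hat\Sigma$ and
$s \models \textit{well-formed}(\hat\Sigma)$, let $U$ be the pure formula returned
by $\textit{match}(s, \hat\Sigma, \Sigma, \Sigma')$.

- If $s \models U$ then $U \wedge \hat\Sigma \to (\hat\Sigma \setminus \Sigma) * \Sigma'$
is valid and, otherwise,

- if $s \not\models U$ there is a heap $h$ such that
$s, h \not\models \hat\Sigma \to (\hat\Sigma \setminus \Sigma) * \Sigma'$.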

Proof. The proof goes by induction, following the recursive definition of the 
match function. 

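- Suppose we reach line 10, with a predicate $S \in \Sigma$ such that
$s \models \textit{empty}(S)$. Recursively let
$U' = \textit{match}(s, \hat\Sigma, \Sigma \setminus S, \Sigma')$ and
$U = \textit{empty}(S) \wedge U'$. Since $s \models \textit{empty}(S)$ it follows
that $s \models U \leftrightarrow U'$.

• if $s \models U$, we want to show that
$U \wedge \hat\Sigma \to (\hat\Sigma \setminus \Sigma) * \Sigma'$ is valid, so take
any model $s', h \models U \wedge \hat\Sigma$. By induction we know the formula
$U' \wedge \hat\Sigma \to R$ is valid, where
$R = (\hat\Sigma \setminus (\Sigma \setminus S)) * \Sigma' = (\hat\Sigma \setminus \Sigma) * S * \Sigma'$.
It therefore follows that $s', h \models (\hat\Sigma \setminus \Sigma) * S * \Sigma'$.
Since $s' \models \textit{empty}(S)$, there is nothing allocated in $h$ for $S$
and, thus, $s', h \models (\hat\Sigma \setminus \Sigma) * \Sigma'$.

• if $s \not\models U$, by induction there is a heap $h$ such that
$s, h \models \hat\Sigma$ but, at the same time,
$s, h \not\models (\hat\Sigma \setminus \Sigma) * S * \Sigma'$. Again, since
$s, \emptyset \models S$, it must be the case that
$s, h \not\models (\hat\Sigma \setminus \Sigma) * \Sigma'$; otherwise we would
obtain a contradiction.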

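- Suppose we reach line 12 with a predicate $S' \in \Sigma'$ such that
$s \models \textit{empty}(S')$. Recursively let
$U' = \textit{match}(s, \hat\Sigma, \Sigma, \Sigma' \setminus S')$ and
$U = \textit{empty}(S') \wedge U'$. Again we have
$s \models U \leftrightarrow U'$.

• if $s \models U$, we want to show that
$U \wedge \hat\Sigma \to (\hat\Sigma \setminus \Sigma) * \Sigma'$ is valid, so take
any model $s', h \models U \wedge \hat\Sigma$. By induction we know that
$U' \wedge \hat\Sigma \to (\hat\Sigma \setminus \Sigma) * (\Sigma' \setminus S')$ is
valid and, thus, we also get that
$s', h \models (\hat\Sigma \setminus \Sigma) * (\Sigma' \setminus S')$. Again, from
$s' \models \textit{empty}(S')$ and $s', \emptyset \models S'$ it follows that
$s', h \models (\hat\Sigma \setminus \Sigma) * (\Sigma' \setminus S') * S'$
or, equivalently, $s', h \models (\hat\Sigma \setminus \Sigma) * \Sigma'$.

• if $s \not\models U$, by induction there is a heap $h$ such that
$s, h \models \hat\Sigma$ but, at the same time,
$s, h \not\models (\hat\Sigma \setminus \Sigma) * (\Sigma' \setminus S')$.
Similarly $s, \emptyset \models S'$, so it must be the case that
$s, h \not\models (\hat\Sigma \setminus \Sigma) * (\Sigma' \setminus S') * S'$ or,
equivalently, $s, h \not\models (\hat\Sigma \setminus \Sigma) * \Sigma'$.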

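- Suppose we reach line 14, with two predicates $S \in \Sigma$ and
$S' \in \Sigma'$, such that the stack
$s \models \textit{match-step}(\hat\Sigma, S, S')$. Let
$S'' = \textit{residue}(S, S')$, recursively obtain
$U' = \textit{match}(s, \hat\Sigma, \Sigma \setminus S, (\Sigma' \setminus S') * S'')$
and let $U = \textit{match-step}(\hat\Sigma, S, S') \wedge U'$. As before we have
$s \models U \leftrightarrow U'$.

• if $s \models U$, we want to show that
$U \wedge \hat\Sigma \to (\hat\Sigma \setminus \Sigma) * \Sigma'$ is valid. That is,
any model $s', h \models U \wedge \hat\Sigma$ is also a model of
$(\hat\Sigma \setminus \Sigma) * \Sigma'$. By induction we have that
$U' \wedge \hat\Sigma \to R$ is valid, where the formula

$R = (\hat\Sigma \setminus (\Sigma \setminus S)) * (\Sigma' \setminus S') * S'' = (\hat\Sigma \setminus \Sigma) * (\Sigma' \setminus S') * S * S''$.

Since $s', h \models U' \wedge \hat\Sigma$ it follows that $s', h \models R$. By
Proposition 2, since $s' \models \textit{match-step}(\hat\Sigma, S, S')$, we obtain
that $s', h \models (\hat\Sigma \setminus \Sigma) * (\Sigma' \setminus S') * S'$
or, equivalently, $s', h \models (\hat\Sigma \setminus \Sigma) * \Sigma'$.

• if $s \not\models U$, by induction there exists a heap $h$ such that
$s, h \models \hat\Sigma$ but
$s, h \not\models (\hat\Sigma \setminus \Sigma) * (\Sigma' \setminus S') * S * S''$.
Partition $h = h_1 * h_2$ such that $s, h_1 \models \hat\Sigma \setminus S$ and
$s, h_2 \models S$. Now note that, regardless of the value of $S$, letting
$h_2' = \{ s(x) \mapsto s(y) \}$ and $h' = h_1 * h_2'$ we have that both
$s, h_2' \models S$ and $s, h' \models \hat\Sigma$. We claim that
$s, h' \not\models (\hat\Sigma \setminus \Sigma) * \Sigma'$.
Assume by contradiction that $s, h' \models (\hat\Sigma \setminus \Sigma) * \Sigma'$,
and partition now $h' = h_3 * h_4$ such that
$s, h_3 \models (\hat\Sigma \setminus \Sigma) * (\Sigma' \setminus S')$ and
$s, h_4 \models S'$. Because $S$ and $S'$ collide, it follows that
$\operatorname{dom} h_2' = \{ s(\textit{addr}(S)) \} \subseteq \operatorname{dom} h_4$
and $h_4 = h_2' * h_5$ for some remainder $h_5$. Then, by Proposition 1,
$s, h_4 \models S * S''$ and $s, h_5 \models S''$. But
$h = h_1 * h_2 = h_3 * h_2 * h_5$ would then be a model of
$(\hat\Sigma \setminus \Sigma) * (\Sigma' \setminus S') * S * S''$, contradicting
our inductive hypothesis.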

- Suppose we reach line 16. We can find ourselves in several situations:

• $\Sigma' = \emptyset$, $\Sigma = \emptyset$, and the function returns
$U = \top$. In this case it is trivial that $s \models U$ and that
$U \wedge \hat\Sigma \to (\hat\Sigma \setminus \emptyset) * \emptyset$ is valid.

• $\Sigma' = \emptyset$, there is an $S \in \Sigma$, and the function returns
$U = \bot$. In this case $s \not\models U$, so we need to find a counterexample
for the entailment. From Proposition 5 there is a heap $h$ such that
$s, h \models \hat\Sigma$. Partition $h = h_1 * h_2$ such that
$s, h_1 \models (\hat\Sigma \setminus \Sigma)$ and $s, h_2 \models \Sigma$. Since
$S$ occurs in $\Sigma$, and at this point $s \not\models \textit{empty}(S)$, it is
necessarily the case that $s(\textit{addr}(S)) \in \operatorname{dom} h_2$; in
particular $h_2 \neq \emptyset$ and, because $h = h_1 * h_2$, we obtain
$s, h \not\models (\hat\Sigma \setminus \Sigma)$. Furthermore, since
$\Sigma' = \emptyset$, this is equivalent to
$s, h \not\models (\hat\Sigma \setminus \Sigma) * \Sigma'$.

• There is an $S' \in \Sigma'$ and an $S \in \Sigma$ such that
$s \models \textit{collide}(S, S')$, and the function returns $U = \bot$. Since we
did not end up on line 14, it must be the case that
$s \not\models \textit{enclosed}(\hat\Sigma, S, S')$. By Proposition 5 there is a
heap $h$ such that $s, h \models \hat\Sigma$. Partition $h = h_1 * h_2$ such that
$s, h_1 \models (\hat\Sigma \setminus S)$ and $s, h_2 \models S$. Let
$h_2' = \{ s(x) \mapsto s(y) \}$ and $h' = h_1 * h_2'$; since
$s \models \neg\textit{empty}(S)$ we have that $s, h_2' \models S$ and
$s, h' \models \hat\Sigma$.

If it turns out that $s, h' \not\models (\hat\Sigma \setminus \Sigma) * \Sigma'$ we
are done. Assume otherwise that
$s, h' \models (\hat\Sigma \setminus \Sigma) * \Sigma'$ and partition the heap
$h' = h_3 * h_4$ such that
$s, h_3 \models (\hat\Sigma \setminus \Sigma) * (\Sigma' \setminus S')$ and
$s, h_4 \models S'$. Since the predicates $S$, $S'$ collide and are non-empty, it
follows that
$\{ s(\textit{addr}(S)) \} = \operatorname{dom} h_2' \subseteq \operatorname{dom} h_4$
and, therefore, $h_4 = h_2' * h_5$ for some remainder $h_5$. By Proposition 1 it
follows that $s, h_4 \models S * S''$ and $s, h_5 \models S''$. Since
$h' = h_3 * h_2' * h_5$ it then follows that
$s, h_3 * h_5 \models (\hat\Sigma \setminus S)$. By Proposition 3 there is an $h_6$
such that $s, h_3 * h_5 * h_6 \models \hat\Sigma$ but
$s, h_5 * h_6 \not\models S'$. However, since
$s, h_3 \models (\hat\Sigma \setminus \Sigma) * (\Sigma' \setminus S')$, it follows
that $s, h_3 * h_5 * h_6 \not\models (\hat\Sigma \setminus \Sigma) * \Sigma'$.
The heap $h_3 * h_5 * h_6$ is a counterexample for the entailment.

• There is some $S' \in \Sigma'$ and $s \not\models \textit{collide}(S, S')$ for
all $S \in \Sigma$, thus the function returns $U = \bot$. By Proposition 5 there
is a heap $h$ such that $s, h \models \hat\Sigma$. Partition $h = h_1 * h_2$ into
two parts such that $s, h_1 \models (\hat\Sigma \setminus \Sigma)$ and
$s, h_2 \models \Sigma$. Since $S'$ does not collide with any predicate in
$\Sigma$, it follows that $s(\textit{addr}(S')) \notin \operatorname{dom} h_2$, in
particular $s, h_2 \not\models \Sigma'$. From this it follows that
$s, h_1 * h_2 \not\models (\hat\Sigma \setminus \Sigma) * \Sigma'$. □

The correctness of the match procedure, formally stated previously in Theorem 2,
follows as a corollary of this proposition for the case when
$\hat\Sigma = \Sigma$. Termination of the procedure can also be easily verified
since, at the recursive calls in lines 10 and 14, the size of the third argument
decreases and, when it stays the same at the recursive call in line 12, the size
of the fourth argument decreases. This same termination argument also shows that
the number of recursive steps is in fact linear in the size of $\Sigma$ and
$\Sigma'$.

Finally we are ready to prove the termination and correctness of the main
prove procedure as stated earlier in Theorem 3. Specifically, we show that the
procedure returns valid if, and only if, the entailment
$\Pi \wedge \Sigma \to \Pi' \wedge \Sigma'$ supplied as argument is indeed valid.

Proof (of Theorem 3). Termination can be established since, at each iteration of
the loop at line 3, the number of satisfying models of $\Gamma$ is strictly
reduced. Since there is only a finite number of formulas that can be built by
combinations of $\textit{empty}(S)$ and $\textit{match-step}(\Sigma, S, S')$, the
building blocks for $U$, all suitable combinations are exhausted at some point.

For correctness we now prove that line 3 at the base of the loop always
satisfies the invariants:

1. $\Gamma \to \Pi \wedge \textit{well-formed}(\Sigma)$, and

2. if $\Gamma \wedge \Sigma \to \Pi' \wedge \Sigma'$ is valid then also
$\Pi \wedge \Sigma \to \Pi' \wedge \Sigma'$ is.

The first invariant can be easily verified by inspecting the code and noting
that at the beginning $\Gamma = \Pi \wedge \textit{well-formed}(\Sigma)$, and
later only more conjuncts are appended to $\Gamma$.

For the second invariant, right before entering the loop we have that
$\Gamma = \Pi \wedge \textit{well-formed}(\Sigma)$. So, assuming that
$\Pi \wedge \textit{well-formed}(\Sigma) \wedge \Sigma \to \Pi' \wedge \Sigma'$ is
valid, take any $s', h \models \Pi \wedge \Sigma$; from Proposition 4 it follows
that $s' \models \textit{well-formed}(\Sigma)$ and therefore, from our assumption,
$s', h \models \Pi' \wedge \Sigma'$.

If we enter the code of the loop we have that $s \models \Gamma$ and start by
letting $U = \textit{match}(s, \Sigma, \Sigma, \Sigma')$. If
$s \not\models \Pi' \wedge U$, then there are two possibilities. Either
$s \not\models \Pi'$, and from Proposition 5 there is a heap $h$ such that
$s, h \models \Pi \wedge \Sigma$ but $s, h \not\models \Pi'$; or
$s \not\models U$, in which case from Theorem 2 there is an $h$ such that
$s, h \models \Pi \wedge \Sigma$ but $s, h \not\models \Sigma'$. In either case the
entailment is invalid and the procedure correctly reports this.

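Alternatively, if $s \models \Pi' \wedge U$, then assuming that
$\Gamma \wedge \neg(\Pi' \wedge U) \wedge \Sigma \to \Pi' \wedge \Sigma'$ is valid
we have to prove that $\Pi \wedge \Sigma \to \Pi' \wedge \Sigma'$ is valid as well.
Take any $s', h \models \Pi \wedge \Sigma$. If $s', h \models \Pi' \wedge U$ then,
since from Theorem 2 the formula $U \wedge \Sigma \to \Sigma'$ is valid, we get
$s', h \models \Pi' \wedge \Sigma'$. Otherwise, if
$s', h \not\models \Pi' \wedge U$, from our assumption we have as well
$s', h \models \Pi' \wedge \Sigma'$. □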
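
The loop analysed in this proof can be sketched as follows. This is only an illustration under stated assumptions, not the Asterix implementation: pure formulas are modelled as Python predicates over stacks, a finite list of candidate stacks stands in for the SMT solver's model search, and match is passed in as a callback rather than implemented. The names prove, find_model and candidates are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence

Stack = Dict[str, int]
Pure = Callable[[Stack], bool]      # a pure formula, evaluated on a stack


def find_model(gamma: List[Pure], candidates: Sequence[Stack]) -> Optional[Stack]:
    """Stand-in for the solver: first candidate satisfying every conjunct."""
    for s in candidates:
        if all(conjunct(s) for conjunct in gamma):
            return s
    return None


def prove(pi: Pure, well_formed: Pure, pi_prime: Pure,
          match: Callable[[Stack], Pure], candidates: Sequence[Stack]) -> bool:
    """Return True for 'valid' and False for 'invalid'."""
    gamma: List[Pure] = [pi, well_formed]    # Gamma = Pi /\ well-formed(Sigma)
    while (s := find_model(gamma, candidates)) is not None:
        u = match(s)                         # U = match(s, Sigma, Sigma, Sigma')
        if not (pi_prime(s) and u(s)):
            return False                     # s gives rise to a counterexample
        # Append not(Pi' /\ U); this excludes s and all stacks it covers.
        gamma.append(lambda t, u=u: not (pi_prime(t) and u(t)))
    return True


stacks = [{"x": a, "y": b} for a in range(2) for b in range(2)]
always = lambda s: True
distinct = lambda s: s["x"] != s["y"]
trivial_match = lambda s: always    # hypothetical match that always returns "true"

print(prove(distinct, always, distinct, trivial_match, stacks))
print(prove(always, always, distinct, trivial_match, stacks))
```

Each iteration appends a conjunct that the current stack falsifies, so the set of candidates satisfying Gamma shrinks strictly, which mirrors the termination argument above.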

6 Experiments 

We implemented our entailment checking algorithm in a tool called Asterix, using
Z3 as the theory back-end for testing the satisfaction of pure formulas and
evaluating expressions against pure stack-models. The tool already accepts
arbitrary theory expressions and assertions as part of the entailment formula.

Copies   SmallFoot       slp   Asterix
     1        0.01      0.11      0.17
     2        0.07      0.06      0.19
     3        1.03      0.08      0.23
     4        9.53      0.13      0.26
     5       55.85      0.38      0.31
     6      245.69      2.37      0.39
     7       (64%)     20.83      0.54
     8       (15%)    212.17      0.85
     9                            1.49
    10                            2.81

Table 1. Running times in seconds while checking 'clones' of SmallFoot examples.



However, due to the current lack of realistic application benchmarks making use 
of such theory features, we only report the running times of this new implemen- 
tation against already published benchmarks from [23] . 

Table 1 shows experiments that have a significant number of repeated spatial
atoms in the entailment. They are particularly difficult for the unfolding
implemented in slp and the match function in Asterix. Since our match function
collects constraints that can potentially be useful for other applications of
match, we observe a significant improvement: Asterix checks the examples with
10 copies in 2.81 seconds, whereas slp already needs 212.17 seconds for 8 copies.

7 Conclusion 

We have presented a method for extending an SMT solver with separation logic
using the list segment predicate. Our method decides entailments of the form
$\Pi \wedge \Sigma \to \Pi' \wedge \Sigma'$, whose pure and spatial components may
freely use arbitrary theory assertions and theory expressions, as long as they
are supported by the back-end SMT solver. Furthermore, we provide a formal proof
of correctness of the algorithm, as well as experimental results with an
implementation using Z3 as the theory solver.